1. Introduction, Data and Descriptive analysis


This paper analyzes the relationship between the university performance of freshman students,
measured by the University Credits (CUs)¹ earned during the first semester, their results in the
TE.L.E.MA.CO. (TEst di Logica E MAtematica e COmprensione verbale) test, and their
socio-demographic characteristics. Starting from the Bologna Declaration of 1999 (ministerial
decree of November 3, 1999, no. 509), the Italian university system has seen important changes
at the organizational, educational, and financial levels. The training credit model was
introduced to harmonize national and international university systems. Another major change in
the reform was the reorganization of degree courses into homogeneous classes. The reform
established a three-cycle higher education system comprising undergraduate degrees (three-year
bachelor’s degrees), master’s or specialist degrees (two-year master-equivalent degrees), and
doctoral studies. The education system also allows students to attend other courses, such as
first- and second-level masters. Furthermore, in 2004 non-selective admission tests were
introduced for all bachelor’s degrees.

The Department of Economics and Business Studies (DIEC) of the University of Genoa (Italy),
which has open-enrolment courses, adopted the TE.L.E.MA.CO. test, an important tool for
verifying the initial knowledge considered necessary for effective participation in a university
course. It consists of two sections: a common core for all degree programs, which assesses basic
comprehension of Italian texts (literacy) and logical reasoning skills (numeracy), and a section
that differs according to the chosen program². Additional mandatory tasks are assigned to
students whose score is lower than the established thresholds.
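
The eligibility rule given in the footnote on the common core (a score of at least 12 out of 20 to pass, and at least 6 in a section to access the corresponding extension) can be sketched as follows. This is only an illustration: the function and field names are hypothetical, and the three scores are taken as separate inputs because the text does not state how the common core score is built from the section scores.

```python
from dataclasses import dataclass

CC_PASS_MARK = 12      # common core pass mark (out of 20), as stated in the footnote
SECTION_PASS_MARK = 6  # minimum section score to access an extension, as stated in the footnote


@dataclass
class TelemacoOutcome:
    passed_common_core: bool
    eligible_text_extension: bool   # T extension, tied to the literacy section
    eligible_math_extension: bool   # M extension, tied to the numeracy section


def telemaco_outcome(cc_score: float, literacy: float, numeracy: float) -> TelemacoOutcome:
    """Apply the pass rule from the footnote (hypothetical helper, not DIEC code)."""
    passed = cc_score >= CC_PASS_MARK
    return TelemacoOutcome(
        passed_common_core=passed,
        eligible_text_extension=passed and literacy >= SECTION_PASS_MARK,
        eligible_math_extension=passed and numeracy >= SECTION_PASS_MARK,
    )
```

For example, `telemaco_outcome(13, 5, 8)` describes a student who passes the common core and is eligible for the mathematical extension only.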

Data are collected by the DIEC. The main dataset derives from three different sources: the first
contains information on socio-demographic characteristics and students’ educational backgrounds;
the second contains information on the university career; the third contains the results of the
TE.L.E.MA.CO. admission test. The main dataset records information on 488 students enrolled in
the Department of Economics of the University of Genoa; they are all pure freshmen (first
matriculation at the university) and not exempted from the obligation to take the test³. The
attributes considered are age, gender, high school, diploma grade, course of study, results of
the TE.L.E.MA.CO. test, and average number of CUs.

Once the main dataset had been assembled, we performed a descriptive analysis of the students’
characteristics. The average age of the students is 19 years, and females represent 31% of the
sample. 55% of students are enrolled in Business Administration, 27% in Economics of Maritime
Business, Logistics and Transport, and 18% in Economics. The average high school final grade is
74.78, and 25% of the students have a grade higher than 81. Women in the sample have, on
average, better high school final grades than men: the mean for female students is 76.75, while
the mean for male students is 73.88. A t-test confirmed a significant difference in means
between the two groups.

¹CUs are indicators that measure the workload required to attend the lessons and prepare for the
specific exam.

²Students pass the common core if and only if they obtain a score equal to or higher than 12 out
of 20. Those who have passed the common core and who have achieved a score equal to or greater
than 6 in the individual sections (literacy and numeracy) can access the T (text) and M
(mathematical) extensions respectively.

³Students are exempted if they have achieved a high school final score equal to or greater than
90/100 or are in other particular situations listed at the following link:
https://unige.it/studenti/telemaco#cosaTELEMACO.

To complete the picture, it is worth examining how these differences in performance relate to
the different types of high school. Table 1 shows the frequency distribution of students and
their high school and university performance by school of origin. High school performance is
measured by the average diploma grade, while university performance is measured by the number of
CUs gained during the first semester and by the average Common Core Score (CC Score in Table 1)
in the TE.L.E.MA.CO. test. It is worth noting that about 40% of students enrolled in Economics
in 2021 come from the scientific high school, followed by the technical institute with 30% and
the vocational institute and linguistic high school with 9% each. Regarding the TE.L.E.MA.CO.
test results, 346 of the 488 students were successful: 65% of the females and 74% of the males
who took the test passed it⁴. Looking at the distribution of the common core scores by gender,
there is no gender gap in the scores obtained in the literacy section; differences emerge,
however, in the numeracy section. While high school performance shows a gap in favor of females,
the picture is reversed in the numeracy section, where male students perform better than
females, a result confirmed by a t-test⁵. These two results may be consistent. Indeed, we do not
know whether the differences in favor of males in STEM⁶ subjects (which occur in our sample for
the numeracy section) also exist in the grades of the high school STEM tests. On average, we
know that females get higher graduation marks, but we do not know what their performance in STEM
subjects is. It should be considered that our sample includes only students who must take the
test (therefore not the best in terms of school performance) and that the male-female ratio
among students from scientific high schools (students with a stronger propensity for scientific
subjects) is very high compared to other institutes. There is therefore a problem of sample
selection and balance, which does not allow us to interpret the gender gap exhaustively.


Table 1: Distribution of students’ high school and university performances by school of origin 


School Type                      Number of students   Female   Male   Average grade   CC Score     CUs
Other types                                      13        4      9           73.78      11.92   10.38
Vocational Institute                             43       19     24           75.21      11.02    5.44
Technical Institute                             145       44    101           76.23      12.48   11.98
Classical High School                            24        6     18           77.08      14.17   13.88
High School for Human Sciences                   23       12     11           75.96      12.70    7.04
Linguistic High School                           43       25     18           75.79      13.93   12.14
Scientific High School                          197       43    154           73.04      14.20   16.99
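
The shares quoted in the text can be recomputed from the counts in Table 1. The short sketch below does so; the dictionary and variable names are illustrative, and only the female and male counts printed in the table are used.

```python
# Female and male counts per school type, copied from Table 1.
TABLE_1 = {
    "Other types": (4, 9),
    "Vocational Institute": (19, 24),
    "Technical Institute": (44, 101),
    "Classical High School": (6, 18),
    "High School for Human Sciences": (12, 11),
    "Linguistic High School": (25, 18),
    "Scientific High School": (43, 154),
}

total_students = sum(f + m for f, m in TABLE_1.values())
total_females = sum(f for f, _ in TABLE_1.values())

# Percentage of females in the whole sample.
female_share = 100 * total_females / total_students

# Percentage of the sample coming from each school type.
school_share = {
    school: 100 * (f + m) / total_students for school, (f, m) in TABLE_1.items()
}

# Number of males per female, by school type.
male_female_ratio = {school: m / f for school, (f, m) in TABLE_1.items()}

for school, share in school_share.items():
    print(f"{school:32s} {share:5.1f}%  M/F ratio {male_female_ratio[school]:.2f}")
print(f"Females in sample: {female_share:.1f}% of {total_students}")
```

The counts sum to the 488 students of the sample, with roughly 31% females and about 40% of students coming from the scientific high school, in line with the figures reported above.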


Focusing on the students’ background, Figure 1 shows that, on average, students who come from
scientific and classical high schools perform better than the others in all sections, and that,
on average, students who attended vocational or other types of high school (such as music or
artistic high school) do not pass the common core of the TE.L.E.MA.CO. test. In the mathematical
extension, students from the scientific high school perform much better than the others, even
compared to students from the classical high school. Finally, we examine the performance of
students during the first semester by looking at the number of CUs (which ranges from 0 to 27):
33% of students do not pass any exam (0 CUs), while 28% reach the 27 CUs threshold. The number
of CUs earned at the end of the first semester follows the same trend for both male and female
students. Looking at the backgrounds, students from vocational institutes, human sciences, or
other types of high schools perform worse, while students with scientific and classical
backgrounds earn a greater number of CUs.

⁴Moreover, 253 students passed the mathematical extension: 43% of the females and 55% of the
males who took it. No one is allowed to take the text extension.

⁵This result is consistent with the literature on the gender gap in STEM courses (Priulla et
al., 2021).

⁶STEM is an acronym for the fields of science, technology, engineering, and math.


Figure 1: TE.L.E.MA.CO. test scores’ distribution per school type 
Source: Computed on the basis of data from DIEC, 2021


2. Empirical Model 


In this section, we estimate two models, a logistic and an ordered logistic regression, to study
the probability of acquiring CUs. These approaches help to understand when and how timely
policies and programs can be implemented to avoid losing students, a frequent occurrence,
especially in the first semester of the first year. Specifically, the logit model represents the
probability of earning at least 18 CUs⁷ during the first semester as a function of students’
characteristics and their TE.L.E.MA.CO. test results. The choice of expressing the dependent
variable as a dummy depends on the fact that, only a few months after the start of a university
career, a student has necessarily taken few, if any, exams. This implies a minimum (0) and a
maximum (27) number of credits, which prompted us to use whether or not the threshold is reached
as a proxy of academic performance. The binary dependent variable is equal to 1 if the student
gains at least 18 CUs (2 exams) by the end of the first semester, and 0 otherwise. The
independent variables included in the model are the following: gender (dummy variable); age at
enrolment⁸; high school final mark (normalized to the range from 60 to 100); type of school;
university course; two variables that capture the literacy and numeracy scores⁹; a variable that
measures the distance in km between home (we use the high school address as a proxy) and the
university; and a variable that represents the average income in the municipality of residence,
as a proxy of the parents’ income. We suppose that the last two variables have an important,
even if indirect, impact on students’ performance. The idea that commuting or changing habits
and home (especially at the very beginning) may negatively affect performance is widespread in
the literature (Tigre et al., 2016). Socio-economic conditions can also influence school
achievement. The left side of Table 2 reports the main results of the logit model (odds ratios,
estimated coefficients, standard errors, and p-value significance).

⁷We chose this threshold because it corresponds to 2 out of 3 exams: in the first semester there
are only 9-credit exams by default.

⁸The variable is dichotomized into <= 19 and > 19; the dummy takes the value 0 if the student
has a regular or early schooling path, and 1 otherwise.

⁹We do not consider the mathematical extension score because this variable hides the effects of
other covariates, even though only a part of the sample accesses the test.


Table 2: Logit and Ordered Logit estimates 


                      Logit model                          Ord. Logit model
                      OR      Coef.    SE      Signif.     OR      Coef.    SE      Signif.
Inactive | 1 exam                                          0.138   -1.982   0.223   ***
1 exam | 2 exams                                           0.311   -1.169   0.214   ***
2 exams | 3 exams                                          1.058    0.056   0.209
Intercept (Logit)     3.125    1.139   0.259   ***
Gender (M)            0.758   -0.277   0.242               0.777   -0.252   0.198
Age at enrollment     0.861   -0.150   0.274               0.880   -0.127   0.234
Technical HS          0.455   -0.788   0.272   **          0.394   -0.933   0.214   ***
Vocational HS         0.095   -2.354   0.477   ***         0.102   -2.283   0.382   ***
Classic HS            0.438   -0.826   0.482               0.391   -0.940   0.377   **
Linguistic HS         0.259   -1.353   0.381   ***         0.291   -1.233   0.315   ***
Human Sciences HS     0.160   -1.833   0.528   ***         0.121   -2.114   0.451   ***
Other types           0.345   -1.063   0.673               0.326   -1.121   0.176   ***
HS grade              1.081    0.078   0.014   ***         1.075    0.072   0.011   ***
Score Literacy        1.081    0.078   0.070               1.075    0.072   0.060
Score Numeracy        1.149    0.138   0.050   **          1.166    0.154   0.042   ***
Economics of MB       0.995   -0.005   0.249               0.818   -0.201   0.210
Economics             0.909   -0.095   0.291               0.929   -0.074   0.238
Distance from home    0.998   -0.002   0.000   X           0.999   -0.001   0.001
Average Income        1.000   -0.000   0.000               1.000   -0.000   0.000
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1


The baseline student has the following profile: a female from the scientific high school, aged
19 at most (therefore regular from the academic point of view), with a final grade equal to
74.78 (the average diploma grade of the sample), who has achieved the average sample results in
both the literacy and numeracy sections. In addition, this student attends Business
Administration, has an income equal to the sample average, and lives at zero distance from the
university.

Turning to the results of the logit regression, the intercept shows that for the baseline
student the probability of gaining at least 18 CUs is 76% and the corresponding odds are 3.125
(p<0.01). Regarding the school types, students from high schools other than the scientific one
are less likely to reach the credit threshold, with high significance. The Other types
category, on the other hand, is not significant. Another relevant variable is the high school
final grade: for a unit increase in the final grade, the odds of reaching the threshold are
multiplied by 1.081, that is, the log odds increase by 0.078 (p<0.01). As for the admission
test, the score achieved in the numeracy section is the only significant one: with a probability
of 53%, students who have a score higher than the mean perform better. Distance also has a
significant impact on students’ performance: the further a student lives from the university,
the less likely he or she is to pass two out of three exams. In the literature, the role of
commuting as a penalty on student performance has already been addressed, although not
extensively: the time lost in travel, the physical and mental stress of being far away, and the
greater difficulty in creating work and friendship groups are certainly some of the main
components.
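
As a worked illustration of how the logit coefficients in Table 2 translate into odds and probabilities, the sketch below applies the inverse logit to a linear predictor. It assumes, following the baseline description, that continuous covariates are expressed as deviations from the baseline profile, so that the baseline student has a linear predictor equal to the intercept. The dictionary keys and the function `prob_at_least_18_cus` are hypothetical names, and only a few coefficients are copied from the table.

```python
import math

# Logit coefficients copied from the left side of Table 2 (subset).
LOGIT_COEF = {
    "intercept": 1.139,
    "technical_hs": -0.788,
    "vocational_hs": -2.354,
    "hs_grade": 0.078,        # per grade point above the baseline grade (assumption)
    "score_numeracy": 0.138,  # per point above the baseline score (assumption)
}


def inv_logit(x: float) -> float:
    """Map a log-odds value to a probability."""
    return 1.0 / (1.0 + math.exp(-x))


def prob_at_least_18_cus(**deviations: float) -> float:
    """Predicted probability for a student who differs from the baseline
    only in the covariates passed as keyword arguments."""
    eta = LOGIT_COEF["intercept"]
    for name, value in deviations.items():
        eta += LOGIT_COEF[name] * value
    return inv_logit(eta)


baseline_odds = math.exp(LOGIT_COEF["intercept"])     # about 3.12
baseline_prob = prob_at_least_18_cus()                # about 0.76
technical_prob = prob_at_least_18_cus(technical_hs=1)
grade_odds_ratio = math.exp(LOGIT_COEF["hs_grade"])   # about 1.081

print(f"baseline odds {baseline_odds:.3f}, probability {baseline_prob:.3f}")
print(f"technical high school probability {technical_prob:.3f}")
print(f"odds ratio per grade point {grade_odds_ratio:.3f}")
```

Exponentiating a coefficient gives the odds ratio reported in the OR column, while the inverse logit of the whole linear predictor gives a probability. Under these assumptions, the baseline probability reproduces the 76% quoted in the text.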

To assess the performance of the logit model we use the area under the receiver operating
characteristic (ROC) curve (AUC). The AUC of the logit model is equal to 0.767; since a larger
AUC indicates a more accurate prediction model, the logit model can be considered sufficiently
accurate. Another way to assess model performance is to examine the agreement between actual
observations and predictions through a contingency table. To transform the student’s predicted
probability (the probability of obtaining at least 18 CUs) into a predicted class (whether the
student has obtained at least 18 CUs), it is sufficient to define a cut-off probability value.
This value is computed using Youden’s index¹⁰ (Youden, 1950), and it is equal to 0.570, as shown
in Figure 2. Finally, we compare the actual and predicted classifications to measure the
goodness of the logit model: the percentage of correctly classified students is 70%.
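
The choice of a cut-off through Youden's index can be sketched in a few lines. The function below scans every observed predicted probability as a candidate cut-off and keeps the one that maximises Sensitivity + Specificity - 1. The labels and probabilities are toy values invented purely for illustration, not the DIEC data, and `youden_cutoff` is a hypothetical helper name.

```python
def youden_cutoff(y_true, scores):
    """Return (cutoff, J) where J = sensitivity + specificity - 1 is maximal.

    A student is classified as positive when the predicted probability
    is greater than or equal to the cut-off.
    """
    positives = sum(y_true)
    negatives = len(y_true) - positives
    best_cutoff, best_j = None, -1.0
    for cutoff in sorted(set(scores)):
        predicted = [s >= cutoff for s in scores]
        true_pos = sum(p and y == 1 for p, y in zip(predicted, y_true))
        true_neg = sum((not p) and y == 0 for p, y in zip(predicted, y_true))
        j = true_pos / positives + true_neg / negatives - 1
        if j > best_j:  # ties keep the lowest cut-off
            best_cutoff, best_j = cutoff, j
    return best_cutoff, best_j


# Toy data, invented for illustration only (1 = at least 18 CUs).
y_toy = [0, 0, 1, 0, 1, 1, 0, 1]
p_toy = [0.10, 0.30, 0.35, 0.40, 0.60, 0.70, 0.55, 0.90]

cutoff, j_stat = youden_cutoff(y_toy, p_toy)
accuracy = sum((p >= cutoff) == bool(y) for p, y in zip(p_toy, y_toy)) / len(y_toy)
print(cutoff, j_stat, accuracy)
```

Once the cut-off is fixed, the share of correctly classified observations is read off the resulting contingency table, which is the kind of figure reported above for the logit model.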


Figure 2: ROC curve 
Since in the first semester students earn only 0, 9, 18, or 27 CUs, and every exam carries the
same number of CUs (9) and thus the same difficulty, we decided to estimate an ordered logistic
model to capture more information. In this model too, the dependent variable is the students’
performance in terms of CUs, but this time it is measured on an ordinal scale with 4 categories:
0 exams (inactive), corresponding to 0 credits, 1 exam to 9, 2 to 18, and 3 to 27. The right
side of Table 2 shows the main results of the ordered logistic model. There are three estimates
of the intercept because, with four categories, there are three cutoffs between one category and
the next. Regarding the last cutoff, it is worth noting that the third and fourth categories (2
exams and 3 exams respectively) are not significantly different, and could therefore be
aggregated without consequences. In this case too it is more interesting to comment on the
coefficients, which confirm the results of the logistic model, although some differences emerge:
the variable Other types becomes significant, and the influence on the dependent variable of
other covariates (Technical, Classic, Score Numeracy) increases. However, Distance from home
loses its significance. Compared to the baseline, defined as before, males rather than females,
students from schools other than the scientific one, and students with lower-than-average
diploma and numeracy grades are more likely to obtain fewer CUs. We also perform a Brant test to
check the hypothesis of parallelism, and the test suggests that the assumptions of the ordered
logit regression are met. In addition to the ordered logit coefficients, marginal effects are
used to quantify the direction and magnitude of the changes. Concerning the high school type,
students who come from a high school other than the scientific one (model baseline) have a lower
probability of reaching two or three exams; in particular, the probability is much lower for the
vocational and human sciences high schools (in these cases the likelihood of passing one exam is
also lower). Furthermore, students who attended classical and technical high schools have a
higher probability of passing at least one exam: for example, the probability of passing two
exams is 0.327 higher for a student from a classical high school than for a student from human
sciences. Moreover, if the student’s high school grade or the score in the numeracy section
increases by one point, the likelihood of passing no exams decreases by 1.26% and 2.69%
respectively.

¹⁰Youden’s index, also called Youden’s J statistic, was developed in 1950 by W.J. Youden and is a
single statistic that captures the performance of a dichotomous test. The index considers both
the true positive rate (Sensitivity) and the true negative rate (Specificity), and it is given
by Sensitivity + Specificity - 1.
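
To make the role of the three cutoffs concrete, the sketch below turns them into category probabilities. It assumes the common proportional-odds parametrisation, logit P(Y <= j) = cutoff_j - eta, where eta is the linear predictor, and that eta equals 0 for the baseline student. Both are assumptions, since the text does not state the parametrisation, and the function name is hypothetical.

```python
import math

# Cutoffs copied from the right side of Table 2.
CUTOFFS = [-1.982, -1.169, 0.056]
CATEGORIES = ["0 exams", "1 exam", "2 exams", "3 exams"]


def inv_logit(x: float) -> float:
    """Map a log-odds value to a probability."""
    return 1.0 / (1.0 + math.exp(-x))


def category_probabilities(eta: float = 0.0) -> dict:
    """Category probabilities under logit P(Y <= j) = cutoff_j - eta (assumed form)."""
    cumulative = [inv_logit(c - eta) for c in CUTOFFS] + [1.0]
    probabilities, previous = [], 0.0
    for value in cumulative:
        probabilities.append(value - previous)  # P(Y = j) = P(Y <= j) - P(Y <= j - 1)
        previous = value
    return dict(zip(CATEGORIES, probabilities))


baseline = category_probabilities()
# Same student but from a vocational high school (coefficient -2.283 in Table 2).
vocational = category_probabilities(eta=-2.283)

for label in CATEGORIES:
    print(f"{label:8s} baseline {baseline[label]:.3f}  vocational {vocational[label]:.3f}")
```

Under these assumptions, the baseline probability of passing at least two exams (the last two categories together) is close to the 76% obtained from the logit model, and a negative coefficient shifts probability mass towards the lower categories.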


3. Conclusions 


The objective of this work was to analyze the relationship between students’ university
performance, measured by the University Credits (CUs) earned during the first semester, the
results achieved in the TE.L.E.MA.CO. test, a useful tool for orientation and access to
university studies based on solid scientific methodologies, and their socio-demographic
characteristics. A logit and an ordered logit model are used to compute the probabilities of
reaching at least 18 CUs (logit) or of obtaining 0, 9, 18, or 27 CUs (ordered logit). The models
show that several factors are determinants. As for the students’ background, the graduation
grade and the type of school predict success at exams (with a negative effect especially for
vocational, linguistic, and human sciences high schools). As for the test, the score in the
numeracy section is the main determinant of successful performance. Based on a consistent
statistical approach, our result seems to confirm the ability of the admission test to predict
academic success in the first year (Bestetti et al., 2020; Migliaretti et al., 2017; Carrieri et
al., 2013; CISIA, 2020). Furthermore, given that the students we consider obtained a diploma
grade lower than 90, the admission test remains significant in the presence of the high school
grade, providing additional information that the latter fails to provide. For this reason too,
the test can be a powerful tool and a good alternative to the high school final mark, often the
only information used, as a university admission indicator. As future work, it would be
interesting to understand whether additional and perhaps differentiated approaches are necessary
according to the background of each student, especially at the beginning of their university
careers. In addition, hybrid solutions combining distance and face-to-face teaching could be
implemented to help off-site students.